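library(moderndive)  # provides get_regression_table()
# ncbirths is assumed to be loaded already (for example from the openintro package)
smoke_lm <- lm(weight ~ weeks * habit, data = ncbirths)
get_regression_table(smoke_lm)
Before…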
Now…
How?
Offsets!
# A tibble: 4 × 3
  term              estimate std_error
  <chr>                <dbl>     <dbl>
1 intercept           -5.94      0.484
2 weeks                0.341     0.013
3 habit: smoker       -1.86      1.63
4 weeks:habitsmoker    0.039     0.042
The * means the variables are interacting!
What is the regression equation for non-smoker mothers?
What is the regression equation for smoker mothers?
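A worked sketch of both equations, read from the table above. It assumes non-smokers are the baseline group, since the table only lists offsets for smokers:

\[\widehat{weight}_{non\text{-}smoker} = -5.94 + 0.341 \times \text{weeks}\]
\[\widehat{weight}_{smoker} = (-5.94 - 1.86) + (0.341 + 0.039) \times \text{weeks}\]
\[\widehat{weight}_{smoker} = -7.80 + 0.380 \times \text{weeks}\]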
What if we have a second numerical explanatory variable?
Multiple slopes
# A tibble: 3 × 3
  term      estimate std_error
  <chr>        <dbl>     <dbl>
1 intercept   -6.68      0.492
2 weeks        0.346     0.012
3 mage         0.02      0.006
How do you interpret the value of 0.346?
How do you interpret the value of 0.02?
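One way to approach both questions is to write out the fitted equation from the table. This sketch assumes mage is the mother's age:

\[\widehat{weight} = -6.68 + 0.346 \times \text{weeks} + 0.02 \times \text{mage}\]

Each slope is read while holding the other variable fixed. For mothers of the same age, one additional week is associated with a predicted weight that is 0.346 higher. For pregnancies of the same length, one additional unit of mage is associated with a predicted weight that is 0.02 higher.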
But how do we decide if the interaction model is “best” without a p-value?
When investigating if a relationship differs…
Always start with the “interaction” / different slopes model.
If the slopes look different, you’re done!
If the slopes look similar, then fit the “additive” / parallel slopes model.
Different Enough?
Behind the Plot
geom_smooth() allows for both the intercepts and the slopes to differ
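A minimal sketch of why that matters, assuming the plot uses geom_smooth(method = "lm") with the grouping variable mapped to color. A separate least-squares line per group is the same set of lines the interaction model describes through offsets. The data frame `toy` is simulated and hypothetical; it is not the ncbirths data.

```r
# Simulated stand-in data (hypothetical values, not ncbirths).
set.seed(1)
n <- 100
toy <- data.frame(
  weeks = runif(n, 30, 42),
  habit = factor(rep(c("nonsmoker", "smoker"), each = n / 2))
)
toy$weight <- ifelse(toy$habit == "smoker",
                     -7.8 + 0.38 * toy$weeks,
                     -5.9 + 0.34 * toy$weeks) + rnorm(n, sd = 1)

# One line per group: each group gets its own intercept and its own slope.
fit_non <- lm(weight ~ weeks, data = subset(toy, habit == "nonsmoker"))
fit_smk <- lm(weight ~ weeks, data = subset(toy, habit == "smoker"))

# The interaction model stores the smoker line as offsets from the baseline.
fit_int <- lm(weight ~ weeks * habit, data = toy)
b <- coef(fit_int)

smoker_intercept <- unname(b["(Intercept)"] + b["habitsmoker"])
smoker_slope     <- unname(b["weeks"] + b["weeks:habitsmoker"])

# Compare with the line fitted to smokers alone.
print(c(smoker_intercept, smoker_slope))
print(unname(coef(fit_smk)))
```

The two printed vectors should agree up to floating-point error, and the baseline coefficients of `fit_int` should match `fit_non` in the same way.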
What about now?
# A tibble: 6 × 3
  term                     estimate std_error
  <chr>                       <dbl>     <dbl>
1 intercept                 594.       13.3
2 perc_disadvan              -2.93      0.294
3 size: medium              -17.8      15.8
4 size: large               -13.3      13.8
5 perc_disadvan:sizemedium    0.146     0.371
6 perc_disadvan:sizelarge     0.189     0.323
🤨
Who is baseline?
Deciphering groups – Small schools
\[\widehat{SAT}_{small} = 594 - 2.93 \times \text{percent disadvan}\]
Deciphering groups – Medium schools
\[\widehat{SAT}_{medium} = (594 - 17.8) + (- 2.93 + 0.146) \times \text{percent disadvan}\]
\[\widehat{SAT}_{medium} = 576.2 - 2.784 \times \text{percent disadvan}\]
Deciphering groups – Large schools
\[\widehat{SAT}_{large} = (594 - 13.3) + (- 2.93 + 0.189) \times \text{percent disadvan}\]
\[\widehat{SAT}_{large} = 580.7 - 2.741 \times \text{percent disadvan}\]
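The same offset arithmetic can be sketched in a few lines of R. The numbers are copied by hand from the interaction-model table; the variable names are illustrative and not part of the original analysis.

```r
# Baseline (small schools) intercept and slope, copied from the table.
intercept <- 594
slope     <- -2.93

# Offsets relative to the baseline; the baseline's own offset is 0.
offset_intercept <- c(small = 0, medium = -17.8, large = -13.3)
offset_slope     <- c(small = 0, medium = 0.146, large = 0.189)

# Each group's line is the baseline plus that group's offsets.
group_lines <- data.frame(
  size      = names(offset_intercept),
  intercept = intercept + unname(offset_intercept),
  slope     = slope + unname(offset_slope)
)
print(group_lines)
```

Setting every slope offset to 0 turns this into the parallel slopes case, where only the intercepts differ.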
What if they’re not very different?
Parallel Slopes
# A tibble: 4 × 3
  term          estimate std_error
  <chr>            <dbl>     <dbl>
1 intercept       588.       7.61
2 perc_disadvan    -2.78     0.106
3 size: medium    -11.9      7.54
4 size: large      -6.36     6.92
Group equations – Baseline
\[\widehat{SAT}_{small} = 588 - 2.78 \times \text{percent disadvan}\]
Group equations – Offsets
\[\widehat{SAT}_{medium} = (588 - 11.9) - 2.78 \times \text{percent disadvan}\]
\[\widehat{SAT}_{medium} = 576.1 - 2.78 \times \text{percent disadvan}\]
\[\widehat{SAT}_{large} = (588 - 6.36) - 2.78 \times \text{percent disadvan}\]
\[\widehat{SAT}_{large} = 581.64 - 2.78 \times \text{percent disadvan}\]